Batched Bandit Problems

نویسندگان

  • Vianney Perchet
  • Philippe Rigollet
  • Sylvain Chassang
  • Erik Snowberg
چکیده

Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. We propose a simple policy that operates under this contraint and show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

MAGMA Batched: A Batched BLAS Approach for Small Matrix Factorizations and Applications on GPUs

A particularly challenging class of problems arising in many applications, called batched problems, involves linear algebra operations on many small-sized matrices. We proposed and designed batched BLAS (Basic Linear Algebra Subroutines), Level-2 GEMV and Level-3 GEMM, to solve them. We illustrate how to optimize batched GEMV and GEMM to assist batched advance factorization (e.g. bi-diagonaliza...

متن کامل

Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices

A challenging class of problems arising in many GPU applications, called batched problems, involves linear algebra operations on many small-sized matrices. We designed batched BLAS (Basic Linear Algebra Subroutines) routines, and in particular the Level-2 BLAS GEMV and the Level-3 BLAS GEMM routines, to solve them. We proposed device functions and big-tile settings in our batched BLAS design. W...

متن کامل

Design of a Hybrid Genetic Algorithm for Parallel Machines Scheduling to Minimize Job Tardiness and Machine Deteriorating Costs with Deteriorating Jobs in a Batched Delivery System

This paper studies the parallel machine scheduling problem subject to machine and job deterioration in a batched delivery system. By the machine deterioration effect, we mean that each machine deteriorates over time, at a different rate. Moreover, job processing times are increasing functions of their starting times and follow a simple linear deterioration. The objective functions are minimizin...

متن کامل

Batched bin packing

We introduce and study the batched bin packing problem (BBPP), a bin packing problem in which items become available for packing incrementally, one batch at a time. A batched algorithm must pack a batch before the next batch becomes known. A batch may contain several items; the special case when each batch consists of merely one item is the well-studied on-line bin packing problem. We obtain lo...

متن کامل

Policy Evaluation and Optimization with Continuous Treatments

We study the problem of policy evaluation and learning from batched contextual bandit data when treatments are continuous, going beyond previous work on discrete treatments. Previous work for discrete treatment/action spaces focuses on inverse probability weighting (IPW) and doubly robust (DR) methods that use a rejection sampling approach for evaluation and the equivalent weighted classificati...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015